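Statistical Machine Learning
Wine Cultivar Inference
This project identifies wine cultivars from the results of a physicochemical analysis of the wines. The wines were grown in the same region of Italy but come from three different cultivars. Thirteen constituents present in all three cultivars were measured chemically, and those values are used to classify the wines.
The original data were provided by Forina, M. et al., PARVUS - An Extendible Package for Data Exploration, Classification and Correlation, Institute of Pharmaceutical and Food Analysis and Technologies, Via Brigata Salerno, 16147 Genoa, Italy.
The main attributes are: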
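- a) Wine class (the label to predict)
- b) Malic acid
- c) Ash
- d) Alcalinity of ash
- e) Magnesium
- f) Total phenols
- g) Flavanoids
- h) Nonflavanoid phenols
- i) Proanthocyanins
- j) Color intensity
- k) Hue
- l) OD280/OD315 of diluted wines
- m) Proline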
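
Reading each record into a label plus a numeric feature vector might look like the sketch below. It assumes a comma-separated layout with the class label in the first field, which matches the attribute order listed above but is not confirmed by the project's code. `Sample` and `parseSample` are hypothetical names introduced for illustration.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// Sample is one wine: a class label (1, 2 or 3) plus its measured features.
type Sample struct {
	Class    int
	Features []float64
}

// parseSample parses one comma-separated record whose first field is the
// class label and whose remaining fields are numeric measurements.
// This layout is an assumption; adjust it to match the actual data file.
func parseSample(line string) (Sample, error) {
	fields := strings.Split(strings.TrimSpace(line), ",")
	if len(fields) < 2 {
		return Sample{}, fmt.Errorf("expected a label and at least one feature, got %d field(s)", len(fields))
	}
	class, err := strconv.Atoi(strings.TrimSpace(fields[0]))
	if err != nil {
		return Sample{}, fmt.Errorf("bad class label %q: %w", fields[0], err)
	}
	features := make([]float64, 0, len(fields)-1)
	for i, f := range fields[1:] {
		v, err := strconv.ParseFloat(strings.TrimSpace(f), 64)
		if err != nil {
			// Column numbers are 1-based and count the label as column 1.
			return Sample{}, fmt.Errorf("bad value in column %d: %w", i+2, err)
		}
		features = append(features, v)
	}
	return Sample{Class: class, Features: features}, nil
}

func main() {
	// Illustrative values only, not a row taken from the dataset.
	s, err := parseSample("1, 2.5, 0.75, 100")
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println(s.Class, s.Features)
}
```

Returning an error for every malformed field, rather than silently substituting zero, makes data-reading mistakes visible before they reach the classifier.
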
Because the dataset is small, the MKNN method is used for the classification test. The results are as follows:

    $ go run mknn.go
    the 1 object predicion class is 1 , the real object is 1
    the 2 object predicion class is 1 , the real object is 1
    the 3 object predicion class is 1 , the real object is 1
    the 4 object predicion class is 1 , the real object is 1
    the 5 object predicion class is 1 , the real object is 1
    the 6 object predicion class is 1 , the real object is 1
    the 7 object predicion class is 1 , the real object is 1
    the 8 object predicion class is 1 , the real object is 1
    the 9 object predicion class is 1 , the real object is 1
    the 10 object predicion class is 1 , the real object is 1
    the 11 object predicion class is 2 , the real object is 2
    the 12 object predicion class is 2 , the real object is 2
    the 13 object predicion class is 1 , the real object is 2
    the 14 object predicion class is 2 , the real object is 2
    the 15 object predicion class is 1 , the real object is 2
    the 16 object predicion class is 2 , the real object is 2
    the 17 object predicion class is 2 , the real object is 2
    the 18 object predicion class is 2 , the real object is 2
    the 19 object predicion class is 2 , the real object is 2
    the 20 object predicion class is 2 , the real object is 2
    the 21 object predicion class is 2 , the real object is 2
    the 22 object predicion class is 2 , the real object is 3
    the 23 object predicion class is 3 , the real object is 3
    the 24 object predicion class is 3 , the real object is 3
    the 25 object predicion class is 3 , the real object is 3
    the 26 object predicion class is 2 , the real object is 3
    the 27 object predicion class is 3 , the real object is 3
    the 28 object predicion class is 3 , the real object is 3
    the 29 object predicion class is 3 , the real object is 3
    the 30 object predicion class is 3 , the real object is 3
    the 31 object predicion class is 3 , the real object is 3
    The prediction accuracy is 0.870968

Testing with naive Bayes gives the following classification results:

    $ go run naivebayes.go
    the 1 object predicion class is 2 , the real object is 1
    the 2 object predicion class is 2 , the real object is 1
    the 3 object predicion class is 2 , the real object is 1
    the 4 object predicion class is 1 , the real object is 1
    the 5 object predicion class is 2 , the real object is 1
    the 6 object predicion class is 2 , the real object is 1
    the 7 object predicion class is 2 , the real object is 1
    the 8 object predicion class is 2 , the real object is 1
    the 9 object predicion class is 2 , the real object is 1
    the 10 object predicion class is 2 , the real object is 1
    the 11 object predicion class is 2 , the real object is 2
    the 12 object predicion class is 2 , the real object is 2
    the 13 object predicion class is 2 , the real object is 2
    the 14 object predicion class is 2 , the real object is 2
    the 15 object predicion class is 2 , the real object is 2
    the 16 object predicion class is 2 , the real object is 2
    the 17 object predicion class is 2 , the real object is 2
    the 18 object predicion class is 2 , the real object is 2
    the 19 object predicion class is 2 , the real object is 2
    the 20 object predicion class is 2 , the real object is 2
    the 21 object predicion class is 2 , the real object is 2
    the 22 object predicion class is 2 , the real object is 3
    the 23 object predicion class is 3 , the real object is 3
    the 24 object predicion class is 3 , the real object is 3
    the 25 object predicion class is 3 , the real object is 3
    the 26 object predicion class is 3 , the real object is 3
    the 27 object predicion class is 3 , the real object is 3
    the 28 object predicion class is 3 , the real object is 3
    the 29 object predicion class is 3 , the real object is 3
    the 30 object predicion class is 3 , the real object is 3
    the 31 object predicion class is 3 , the real object is 3
    The prediction accuracy is 0.677419

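
The contents of `naivebayes.go` are not shown here. As a point of comparison, the following is a minimal sketch of a Gaussian naive Bayes classifier, under the assumption that each continuous feature is modeled by a per-class normal distribution. The type `gaussianNB` and the functions `fit` and `predict` are hypothetical names, and the toy data in `main` is invented for illustration. Scores are accumulated as log-probabilities, so multiplying many small densities cannot underflow.

```go
package main

import (
	"fmt"
	"math"
)

// gaussianNB holds per-class priors, feature means and feature variances.
type gaussianNB struct {
	classes []int
	prior   map[int]float64
	mean    map[int][]float64
	vari    map[int][]float64
}

// fit estimates the parameters from training rows x and labels y.
// It assumes x is non-empty and every row has the same length.
func fit(x [][]float64, y []int) *gaussianNB {
	m := &gaussianNB{prior: map[int]float64{}, mean: map[int][]float64{}, vari: map[int][]float64{}}
	count := map[int]int{}
	d := len(x[0])
	// First pass: per-class sums, later turned into means.
	for i, row := range x {
		c := y[i]
		if _, ok := m.mean[c]; !ok {
			m.classes = append(m.classes, c)
			m.mean[c] = make([]float64, d)
			m.vari[c] = make([]float64, d)
		}
		count[c]++
		for j, v := range row {
			m.mean[c][j] += v
		}
	}
	for _, c := range m.classes {
		for j := range m.mean[c] {
			m.mean[c][j] /= float64(count[c])
		}
		m.prior[c] = float64(count[c]) / float64(len(x))
	}
	// Second pass: per-class squared deviations, later turned into variances.
	for i, row := range x {
		c := y[i]
		for j, v := range row {
			diff := v - m.mean[c][j]
			m.vari[c][j] += diff * diff
		}
	}
	for _, c := range m.classes {
		for j := range m.vari[c] {
			// A small floor keeps the variance away from zero.
			m.vari[c][j] = m.vari[c][j]/float64(count[c]) + 1e-9
		}
	}
	return m
}

// predict returns the class with the highest log posterior.
func (m *gaussianNB) predict(row []float64) int {
	best, bestScore := m.classes[0], math.Inf(-1)
	for _, c := range m.classes {
		score := math.Log(m.prior[c])
		for j, v := range row {
			diff := v - m.mean[c][j]
			score += -0.5*math.Log(2*math.Pi*m.vari[c][j]) - diff*diff/(2*m.vari[c][j])
		}
		if score > bestScore {
			best, bestScore = c, score
		}
	}
	return best
}

func main() {
	// Toy data for illustration only, not values from the wine dataset.
	x := [][]float64{{1.0, 10}, {1.2, 11}, {0.9, 9}, {3.0, 30}, {3.1, 29}, {2.9, 31}}
	y := []int{1, 1, 1, 2, 2, 2}
	m := fit(x, y)
	fmt.Println(m.predict([]float64{1.1, 10.5}), m.predict([]float64{3.0, 30.5}))
}
```
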
Algorithm | Accuracy |
---|---|
MKNN | 87.10% |
Naive Bayes | 67.74% |
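When classifying with naive Bayes, 9 of the 10 class 1 wines in the test set were assigned to class 2. Other researchers have reported accuracies of 99.4% with QDA, 98.9% with LDA, and 96.1% with 1NN on this dataset, so there is clear room for improvement here. The cause has not been investigated yet; it may be a problem in how the program reads the data.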
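
A confusion matrix makes the error pattern explicit. The sketch below builds one from the real and predicted classes transcribed from the naive Bayes output; `confusion` is a hypothetical helper and assumes class labels run from 1 to k.

```go
package main

import "fmt"

// confusion returns a matrix m where m[r][p] counts samples of real class r+1
// that were predicted as class p+1. Labels are assumed to be 1..k.
func confusion(actual, pred []int, k int) [][]int {
	m := make([][]int, k)
	for i := range m {
		m[i] = make([]int, k)
	}
	for i := range actual {
		m[actual[i]-1][pred[i]-1]++
	}
	return m
}

func main() {
	// Real and predicted classes for the 31 test objects in the naive Bayes output.
	actual := []int{1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3}
	pred := []int{2, 2, 2, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3}
	for r, row := range confusion(actual, pred, 3) {
		fmt.Printf("real %d: %v\n", r+1, row)
	}
	// Prints:
	// real 1: [1 9 0]
	// real 2: [0 11 0]
	// real 3: [0 1 9]
}
```

Every error lands in the class 2 column, and the diagonal sums to 21 of 31, which matches the reported accuracy of 0.677419.
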